Data analysis is the process of inspecting, rearranging, modifying, and transforming data to extract useful information from it.
Maintaining the integrity of the data is crucial for a data analyst to produce accurate and appropriate analysis. A credible data analyst should have the skills to analyse the statistics of the data and turn them into actionable insights. Improper analysis distorts scientific findings and leads readers to believe a wrong notion. If the analyst has integrity issues, inappropriate analysis affects not only numerical data but non-statistical data as well.


This section covers the methodologies analysts can use to answer each research question and research hypothesis. When addressing a research question, analysts must describe the descriptive statistics used to answer it.
When analysing research hypotheses, describe in detail the inferential statistics used to investigate them. Analysts can also give the formula for a statistic, provided it is a simple one such as the mean, median, or a percentage. Advanced statistics such as ANOVA, however, are too complex to present as formulas.
As the statistics used differ with the needs of each research study, it is crucial for every analyst to have a proper grounding in statistical methods. A researcher has to examine each research question and hypothesis individually and assign appropriate statistics to it.


Most research questions can be answered with descriptive statistics such as percentages or means. When analysts want to know how many participants gave one particular answer, they should use a percentage. A percentage is ideal when respondents fall into distinct categories such as male or female, employed or unemployed, or vegetarian or non-vegetarian. When the data fall into such discrete categories, analysts can also report frequencies. Suppose there are 100 cases in a sample and analysts want to know how many people fall into a particular group; in such a situation a percentage is the natural choice.
When analysts want to know the typical response of all the participants, they can report the mean. The mean is used when the responses are continuous, that is, numbers that run from one point to another. Examples of continuous data are the ages of participants and students' exam scores.
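As a sketch of the rule of thumb above, the snippet below (using a made-up survey sample) reports percentages for a categorical variable and the mean for a continuous one:

```python
from collections import Counter
from statistics import mean

# Hypothetical survey sample: employment status (categorical) and age (continuous)
status = ["employed", "unemployed", "employed", "employed", "unemployed",
          "employed", "employed", "unemployed", "employed", "employed"]
ages = [23, 35, 41, 29, 52, 37, 44, 31, 26, 38]

# Percentages suit discrete categories: share of respondents per group
counts = Counter(status)
percentages = {group: 100 * n / len(status) for group, n in counts.items()}

# The mean suits continuous data such as age
average_age = mean(ages)

print(percentages)   # {'employed': 70.0, 'unemployed': 30.0}
print(average_age)   # 35.6
```

Reporting both the percentage and the raw frequency (7 of 10 employed) gives readers the category breakdown and the sample size at a glance.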


When it comes to data analysis, every field has its own accepted practices. However, the standard rules of data analysis rest on two factors:

  1. The nature of the variables: whether they are quantitative or qualitative.
  2. Assumptions about the population: these could concern distributions, sample sizes, independence, etc.
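The assumption checks in point 2 can be sketched as rule-of-thumb screening. The helper below is illustrative only: the sample-size threshold of 30 and the skewness cut-off of 1 are common conventions, not fixed rules.

```python
from statistics import mean, stdev

def check_basic_assumptions(sample, min_n=30):
    """Rule-of-thumb screening before choosing a statistical test.

    The thresholds here (min_n=30, |skew| > 1) are illustrative
    conventions, not universal rules.
    """
    issues = []
    if len(sample) < min_n:
        issues.append(f"small sample: n={len(sample)} < {min_n}")
    m, s = mean(sample), stdev(sample)
    # Crude sample skewness as a cheap symmetry check
    skew = sum((x - m) ** 3 for x in sample) / (len(sample) * s ** 3)
    if abs(skew) > 1:
        issues.append(f"strong skew: {skew:.2f}")
    return issues

print(check_basic_assumptions([23, 35, 41, 29, 52]))  # flags the small sample
```

Checks like these do not replace formal diagnostics; they merely flag data that deserve a closer look before a test is chosen.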

In some cases, analysts may use unconventional norms as well, provided they clearly state the reason for using the alternative standards. Beyond that, the analyst must also indicate how the new method differs meaningfully from traditional methods.


The primary purpose of using a conventional approach to data analysis is to establish an accepted standard for statistical significance. Analysts must also discuss the importance of attaining statistical significance and whether their objectives are met using the conventional approach.


No matter how sophisticated a statistical analysis is, it cannot correct poorly defined objectives or outcome measurements. Analysts who lack the skill to identify objectives and outcomes often leave reports filled with misinterpretation, which misleads readers.


The main goal of data analysis is to reduce statistical error. Issues analysts often face during the process include:

  • Filling in missing data
  • Altering the data
  • Data mining
  • Creating graphical representations of data
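As one illustration of the first issue, missing values are sometimes filled by mean imputation. The helper below is a minimal sketch; mean imputation shrinks the variance of the data, so whether it is appropriate depends on why the values are missing.

```python
from statistics import mean

def impute_missing(values):
    """Replace missing entries (None) with the mean of the observed values.

    A simple sketch: mean imputation understates variability, so it
    should be used (and disclosed) with care.
    """
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

scores = [70, None, 85, 90, None, 75]
print(impute_missing(scores))  # [70, 80, 85, 90, 80, 75]
```

Whatever filling strategy is chosen, it should be reported alongside the results so readers can judge its effect.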



There comes a time when analysts have to decide how to present the derived data so that it makes the strongest impression. To do that, investigators choose how much of the data to show, and why, when, and to whom to show it. Whenever analysts manipulate data, they must also keep a record, or paper trail, of why and to what extent the data were altered, for future review.
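The paper trail described above can be sketched as a small audit log. The class and field names below are hypothetical; a real project would also persist the log and record the analyst's identity.

```python
from datetime import datetime

class AuditedData:
    """Keep a paper trail of every change made to a dataset (minimal sketch)."""

    def __init__(self, records):
        self.records = list(records)
        self.log = []

    def update(self, index, new_value, reason):
        """Change one value and record what changed, when, and why."""
        old = self.records[index]
        self.records[index] = new_value
        self.log.append({
            "when": datetime.now().isoformat(),
            "index": index,
            "old": old,
            "new": new_value,
            "reason": reason,
        })

data = AuditedData([12, 999, 15])
data.update(1, 14, "999 is a 'no answer' sentinel; replaced with corrected value")
print(data.records)  # [12, 14, 15]
print(len(data.log))  # 1 entry documenting why the change was made
```

Keeping the old value, the new value, and the stated reason together makes later reviews straightforward.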


Sometimes the environment or context plays a significant role in procuring the data. Respondents' answers often differ between one-on-one interviews and focus groups: in a focus group, the larger number of participants often changes a person's response. When conducting data analysis, the researcher must take such environmental factors into account as well.


An analyst's results may differ based on how the data were recorded.
Various methods include:

  1. Data collected in audio or video format for later transcription
  2. Data collected by a researcher, or through a self-administered survey
  3. Closed-ended or open-ended surveys
  4. Notes taken by the researcher, or written by the participants and later submitted to the researchers


Raters are staff researchers who analyse text materials during content analysis. When examining text materials, some evaluators take comments as a whole while others dissect the texts into words, clauses, sentences, etc. To maintain data integrity, it's essential that raters eliminate such inconsistencies in analysis among themselves.
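A simple way to quantify consistency between raters is percent agreement. The sketch below uses made-up topic codes; in practice, chance-corrected measures such as Cohen's kappa are preferred.

```python
def percent_agreement(rater_a, rater_b):
    """Share of items two raters coded identically (simple agreement).

    A minimal consistency check; it does not correct for agreement
    expected by chance.
    """
    if len(rater_a) != len(rater_b):
        raise ValueError("raters must code the same items")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100 * matches / len(rater_a)

# Hypothetical topic codes assigned to the same five text passages
a = ["price", "service", "price", "quality", "service"]
b = ["price", "service", "quality", "quality", "service"]
print(percent_agreement(a, b))  # 80.0
```

Low agreement signals that the coding scheme or rater training needs revisiting before the analysis proceeds.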


When working with inductive techniques, it's important that the analysts are adequately trained and supervised by skilled personnel. In content analysis, raters assign topics to text materials; if the evaluators don't perceive the material as intended, data integrity suffers. Due to a lack of training, the coding skills of one staff researcher may differ tremendously from another's. To combat these challenges, organisations must draft a proper protocol manual, train their analysts periodically, and monitor them routinely.


Reliability and validity are the most crucial aspects of a study, and an analyst must attend to them whether working on quantitative or qualitative data. Coders should be able to re-code the same data in the same way over time; failing to do so tampers with data reliability and can invalidate the data.
All researchers must be fully aware of the potential for compromising data integrity, whether the methods are statistical or non-statistical. Most statistical analysis focuses on quantitative data, but many analytic procedures concentrate mostly on qualitative materials, such as thematic analysis, ethnographic analysis, and content analysis.
Whether analysts study quantitative or qualitative phenomena, they have to utilise a range of tools to test hypotheses, analyse behavioural patterns, and reach a conclusion. Improper implementation and interpretation may result in compromised data integrity.