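Summary
Methods
This study was conducted in Korea from December 3, 2019 to January 7, 2020. Eighteen healthcare-related organizations participated in the survey. Of the 172 questionnaires returned, 164 were used in the final analysis.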
Abstract
Objectives
A growing interest in big data for cancer research and treatment motivated us to investigate healthcare users' demand for such data.
Methods
The survey was conducted from December 3, 2019 to January 7, 2020, in Korea. Respondents from 18 healthcare organizations participated in the survey. Among the 172 questionnaires received, 164 responses were used for the final analyses.
Results
The majority of respondents showed a high awareness of big data related to cancer (n=148, 90.2%). However, only about half of the respondents were aware of how big data related to cancer is used (n=85, 51.8%). Among the respondents with experience using big data (n=83, 50.6%), more than half used big data only about once a year (n=43, 51.8%). The majority of respondents had particularly high demand for big data associated with “chemotherapy” (n=154, 94.5%), followed by “cancer type at diagnosis status,” “clinical stage,” and “recurrence.” The main considerations for releasing cancer big data were “trustworthiness of data” (63.2%), “provision of valuable data” (58.9%), and “improvement of data accuracy” (55.8%).
Conclusions
The study identified that even though respondents have a high awareness of and demand for big data related to cancer, it is not being sufficiently utilized at present. To increase the utilization of big data for cancer research and treatment, it is necessary to consider its purpose and how to make it available in line with the specific requirements of the healthcare industry, hospitals, and academia.
INTRODUCTION
Big data in healthcare refers to electronic health datasets so large and complex that they are difficult to manage with traditional software, hardware, or common data management tools and methods. Big data in healthcare is overwhelming not only because of its volume, but also because of the diversity of data types and the speed at which it must be managed [1–3].
Besides structured large-capacity data, the scope of big data has continually expanded in recent years to include unstructured information [1–6]. In particular, due to the development of information and communication technology and the change in the health paradigm that follows new technology, the amount of data in the healthcare field is rapidly increasing. Along with this, there is a growing interest in analyzing and using big data in the field of healthcare.
The potential of big data in healthcare relies on the ability to turn high volumes of data into actionable knowledge for precision medicine and decision-making. Big data analytics in healthcare is evolving into a promising field for the provision of insights from very large data sets while reducing costs [3,7]. Especially in the case of cancer, there is a large amount of data related to the diagnosis, decision-making, treatment, and prognosis of patients. There is high interest in, and an unmet need for, the utilization of such big data related to cancer in hospitals, industry, and academia.
Korea has a well-established system for utilizing large amounts of medical data accumulated through electronic medical records, and various attempts have been made to utilize this big data [8–12]. However, there have been no discussions on various demands or detailed methods to utilize big data for cancer [13–15]. Therefore, the purpose of this study was to investigate the level of awareness and utilization demand of big data for cancer among workers in the academic, medical, and healthcare industries.
METHODS
Study population and data collection
The target population of this study comprised individuals from tertiary hospitals, research institutes, pharmaceutical companies, contract research organizations, and academia in South Korea. The inclusion criteria were adults aged 19 years or older who were able to understand the questionnaire; the survey targeted students majoring in statistics, professors, researchers, data scientists, data managers, health information managers, and nurses. Those who refused to participate in this study were excluded. A total of 300 questionnaires were distributed. Data were collected either through paper surveys in face-to-face interviews or through questionnaires sent via e-mail. The responses were analyzed at the Yonsei Cancer Center, Korea.
Ethics
Participation in this study was entirely voluntary, and anonymity was guaranteed. All participants agreed that the results of the survey may be used for research purposes.
The method was approved by the Ethics Committee of the Yonsei University College of Medicine, Seoul, Korea.
Questionnaire items
We developed questionnaires to investigate the demand of users for big data related to cancer in healthcare contexts. Survey items were developed with reference to previous studies [16,17].
The questionnaires were originally written and administered in Korean. The questionnaires consisted of five sections: (1) awareness of big data related to cancer; (2) big data usage status and purpose of use; (3) demand for the release of big data to the public; (4) utilization of big data; and (5) basic characteristics of the respondents. The total number of questions to which the respondents could reply was 18. The number of questions answered differed among respondents because the questionnaire included some conditional questions (Appendix 1).
Statistical analysis
Categorical data were summarized as frequencies and percentages (%). A chi-square test was performed to test for differences in the proportions of categorical variables between two or more groups. Statistical analysis was performed using SPSS version 25.0 (IBM Co., Armonk, NY, USA). p-values <0.05 were considered statistically significant. To ensure the clarity, precision, and accuracy of the results, the cases where respondents did not answer a sufficient number of questions, as well as the nonresponding cases, were excluded from the analyses.
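As an illustration of the chi-square comparison described above, the following minimal sketch computes a Pearson chi-square statistic for a 2x2 table in plain Python. It is not the SPSS procedure used in the study. The counts are derived from figures reported in the Results (51 tertiary hospital respondents, 36 of them without big data experience; 83 respondents with and 81 without experience overall), and the choice of this particular comparison, the function name `chi_square_2x2`, and the omission of a continuity correction are assumptions made for the example.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) and
    p-value for a 2x2 contingency table given as two rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    # A 2x2 table has 1 degree of freedom, for which the upper tail of the
    # chi-square distribution equals erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Rows: tertiary hospital group, all other respondents (derived counts).
# Columns: experience using big data (yes, no).
observed = [[15, 36], [68, 45]]
statistic, p_value = chi_square_2x2(observed)
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
```

For these derived counts the p-value falls below the 0.05 threshold stated above. For tables larger than 2x2, a library routine such as `scipy.stats.chi2_contingency` would be the more practical choice.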
RESULTS
Characteristics and knowledge of respondents
We conducted the survey from December 3, 2019 to January 7, 2020. A total of 300 questionnaires were distributed. A total of 172 questionnaire responses were received, and 164 responses were used for the final analyses. Eight respondents who skipped more than one-third of the questions were excluded from the final analyses. The majority of survey respondents worked in the healthcare industry (n=92, 56.1%) or in tertiary hospitals (n=51, 31.1%). Most respondents were in the field of oncology (n=113, 76.4%; multiple response question) and had more than 10 years of working experience (n=81, 49.4%) (Table 1).
Table 1
In addition, we analyzed the results of the baseline characteristics in each group. There was a statistically significant difference in years of experience among the groups participating in the survey (p <0.001). However, in discipline comparisons, only oncology and bioinformatics showed statistically significant differences (p <0.001 and p <0.001, respectively) (Supplementary Table 1).
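The response counts and the exclusion rule reported in this subsection can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names and the two sample answer sheets are hypothetical, and the sketch ignores the conditional questions mentioned in the Methods, treating every respondent as eligible for all 18 questions.

```python
from typing import Optional, Sequence

TOTAL_QUESTIONS = 18  # number of questions stated in the Methods

def is_usable(answers: Sequence[Optional[str]]) -> bool:
    """Return True unless more than one-third of the questions were skipped.
    A skipped question is represented as None (an assumption of this sketch)."""
    skipped = sum(1 for answer in answers if answer is None)
    # Integer comparison avoids floating-point issues at exactly one-third.
    return skipped * 3 <= len(answers)

def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * part / whole, 1)

# Counts reported in the text.
distributed, received, analyzed = 300, 172, 164
response_rate = percent(received, distributed)  # share of questionnaires returned
usable_rate = percent(analyzed, received)       # share of returned questionnaires analyzed

# Hypothetical answer sheets, for illustration only.
complete = ["answered"] * TOTAL_QUESTIONS
mostly_blank = ["answered"] * 11 + [None] * 7   # 7 of 18 skipped

print(response_rate, usable_rate)
print(is_usable(complete), is_usable(mostly_blank))
```

Under these assumptions, a sheet with 7 of 18 questions skipped is excluded, while one with exactly 6 skipped is retained, because the rule excludes only respondents who skipped more than one-third.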
Awareness of big data related to cancer
The question regarding the awareness of big data was segmented into four levels (Table 2). Most respondents (n=148, 90.2%) reported that they knew about big data for cancer. More than half of the respondents replied that they “know very well” (n=9, 5.5%) or “know a little” (n=83, 50.6%). The awareness of usage of big data in healthcare showed mixed results, as about half reported “well” (n=81, 49.4%), closely followed by those who reported “not very well” (n=74, 45.1%) (Table 2). Most respondents agreed (“strongly agree” (n=107, 65.2%) and “agree” (n=53, 32.3%)) on the need to use big data in healthcare services, business, and research. The proportion of “strongly agree” responses was highest in the academia group (n=19, 90.5%). Overall, respondents answered that the use of big data in current healthcare contexts is very necessary. Furthermore, the experience of participants in using big data showed similar distributions of “yes” (n=83, 50.6%) and “no” (n=81, 49.4%) responses. However, when the groups were compared, the tertiary hospital group had a higher proportion of “no” responses for experience using big data (n=36, 70.6%).
Table 2
Status of usage of big data and purpose of use
Respondents with experience using big data were surveyed on the frequency, area, and purpose of big data usage. Overall, “more than once a year” (n=43, 51.8%) was the most common response, followed by “at least once a month” (n=18, 21.7%) and “at least once a week” (n=11, 13.3%). Notably, the academia group had the most active big data users, selecting “at least once a day” (n=4, 36.4%) (Table 3). In addition, the utilization of “cancer diseases” big data (n=56, 68.3%) was the highest among all research areas. Regarding the purpose of using big data, “collecting and utilizing data regarding work (service projects, proposals, reports, etc.)” (n=53, 65.4%) was the most common response, followed by “collecting data for academic research” (n=48, 59.3%) and “collecting data for new drug development” (n=15, 18.5%).
Table 3
Demand for releasing big data related to cancer
We analyzed the results of the questions pertaining to the demand for releasing big data related to cancer. The majority of respondents revealed a particularly high demand for data related to “chemotherapy” (n=154, 94.5%; multiple response question), followed by “cancer type at diagnosis status” (n=139, 85.3%), “clinical stage” (n=138, 84.7%), and “recurrence” (n=136, 83.4%). Overall, respondents wanted information mostly about the treatment and clinical status of patients (Supplementary Table 2).
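Because this was a multiple response question, each percentage is a share of respondents rather than a share of all selections, so the percentages can sum to more than 100%. The sketch below recomputes the reported figures in Python. The denominator of 163 is an assumption: the text does not state it, but 163 reproduces every percentage reported above. The variable and function names are illustrative.

```python
# Counts reported in the text for the multiple-response demand question.
demand_counts = {
    "chemotherapy": 154,
    "cancer type at diagnosis status": 139,
    "clinical stage": 138,
    "recurrence": 136,
}

# Assumption: 163 respondents answered this question (not stated in the text).
RESPONDENTS = 163

def percent_of_respondents(count: int, respondents: int = RESPONDENTS) -> float:
    """Share of respondents who selected an item, rounded to one decimal."""
    return round(100 * count / respondents, 1)

# Rank items from highest to lowest demand.
ranked = sorted(demand_counts.items(), key=lambda item: item[1], reverse=True)
for item, count in ranked:
    print(f"{item}: n={count}, {percent_of_respondents(count)}%")
```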
Regardless of whether they had big data usage experience or not, the respondents said that they required big data in the form of “Excel based file (xls, xlsx, csv)” (n=139, 86.9%), followed by “LOD (Linked Open Data)” (n=19, 10.5%), and “File (json, xml)” (n=11, 6.1%) (Table 4). Most respondents (n=131, 80.4%) revealed a willingness to pay to utilize big data, and the proportion was higher in the group with experience using big data related to cancer (n=70, 84.3%). A higher proportion of academia group respondents with no big data usage experience reported a lack of willingness to pay for big data related to cancer (n=7, 70.0%). The main purpose of big data usage was for “collecting and utilizing data regarding work (service projects, proposals, reports, etc.)” (n=54, 65.1%) among the respondents with big data usage experience. The main considerations for the release/provision of big data related to cancer were: “trustworthiness of data” (n=103, 63.2%), “valuable data” (n=96, 58.9%), and “improvement of data accuracy” (n=91, 55.8%).
Table 4
DISCUSSION
This survey was conducted to understand the awareness of big data for cancer from the perspective of users in the Korean healthcare environment. The main findings indicate that while most respondents recognize the necessity of using big data related to cancer in healthcare, business, and research, they are not using it frequently in practice. In addition, the majority of respondents indicated a particularly large demand for data on chemotherapeutic agents for treatments and other cancer-specific clinical information.
A higher percentage of people in academia, in comparison to other groups, do not know about big data, while the tertiary hospitals group reported the lowest rate of big data usage experience. Nevertheless, regardless of the institution, most respondents (over 95%) showed a willingness to use big data, especially in academia. The reasons for using big data were different among institutions, and as expected, the purpose of academic research received the highest response for the tertiary hospitals group (92.9%), with the healthcare industry tending to use big data for projects or reports (80.7%). Most respondents (80.4%) showed a willingness to pay for big data, and this willingness was highest among those in the healthcare industry, followed by tertiary hospitals and academia. The majority of responses were positive, indicating that respondents are aware of big data as an economically valuable resource.
Recently, there have been various attempts [11,16,18–20] to analyze the needs for big data in the field of healthcare and to encourage big data utilization [21,22]. While European countries are supporting initiatives to utilize big data in the field of oncology [7], there has not been much effort in Korea to make legal and institutional improvements to encourage usage of big data for cancer. The term “big data” has become extremely popular globally in recent years and almost every field of research, whether it relates to industry or academics, is generating and analyzing big data for various purposes [23]. Particularly in medicine and healthcare, big data analytics integrates the analysis of several scientific areas such as bioinformatics, medical imaging, as well as medical and health informatics. The application of big data analytics aids the discovery of comprehensive knowledge from the huge amounts of data available [24]. Big data analytics in medicine and healthcare enables analysis of large datasets from thousands of patients, identifying clusters and correlations between datasets, as well as developing predictive models using data mining techniques [25]. The combination of data analysis and artificial intelligence technologies such as machine learning and deep learning allows for innovation in healthcare services such as patient-specific clinical decision support system utilization and precision medicine in real time [26,27]. Additionally, in new drug development, a field that entails enormous time and high investment costs, a partial solution to the cost-efficiency problem is expected through the utilization of accumulated big data on cancer in clinical trials for diagnosis, treatment, results, and prescriptions.
Korea has recently started to promote the use of accumulated big data in cancer research and treatment along with enhancement of privacy and utilization through the enactment of relevant laws. The release of generated big data relating to cancer is a current trend creating added value. It is very important to consider the requirements of various stakeholders in big data usage and related analyses.
This study has several limitations. First, there are important differences among respondents that may limit the generalizability of the study findings to the Korean population, and the findings may not reflect the opinion of all survey respondents. Second, responses may differ depending on the public policies or legal frameworks in other countries. Third, the respondents in this study were primarily from the healthcare industry and tertiary hospitals, and notably, the respondents from academia were few. Moreover, those who were not familiar with big data did not participate in the survey, and thus, the participants willing to answer the survey could be those who have more knowledge or awareness about the topic, which might bias the results. Lastly, perceptions were not cross-validated among institutions, which requires further investigation. Hence, the results of this survey may only reflect the current situation in South Korea. Despite these limitations, this study forms an important baseline for future studies.
The study was able to identify the high awareness and demand for big data related to cancer among the respondents through the survey. However, compared to the high demand indicated in the survey responses, big data is not being well utilized. There will likely be more demand for big data utilization in the “new normal” era following the coronavirus disease (COVID-19) pandemic. To increase the utilization of big data for cancer, it is necessary to consider ways to release the information in accordance with the purpose and the finer details necessary for using such data in the healthcare industry, hospitals, and academia.
Furthermore, it is necessary to lay the foundation for an environment that can enhance consumer-centered data accessibility and establish detailed policies regarding the scope of using such data, and the methods and procedures associated with rapid and secure release of cancer related big data.
ACKNOWLEDGEMENTS
This study is supported by a grant from the Big data Center at the National Cancer Center of Korea (Grant number: 2021-data-we06).
REFERENCES
1. Frost & Sullivan. Drowning in big data? reducing information technology complexities and costs for healthcare organizations. 2015.
2. Manyika J, Chui M, Brown B, Bughin J, Dobbs R, Roxburgh C, et al. Big data: The next frontier for innovation, competition, and productivity. UK: McKinsey Global Institute; 2011.
3. Raghupathi W, Raghupathi V. Big data analytics in healthcare: promise and potential. Health Inf Sci Syst 2014;2:3. DOI: 10.1186/2047-2501-2-3.
4. Kim J, Kim H, Son K, Song Y, Yoon J, Lim H, et al. Medical utilization of big data. Inf Sci Manage 2014;32(3):18-26. (Korean).
5. Lee J, Jae M, Jo M, Son H. Big data utilization trends in the healthcare. J Korea Inst Electronic Commun Sci 2014;2(1):63-75. (Korean).
6. Popovic JR. Distributed data networks: a blueprint for Big Data sharing and healthcare analytics. Ann N Y Acad Sci 2017;1387(1):105-11. DOI: 10.1111/nyas.13287.
7. Pastorino R, De Vito C, Migliara G, Glocker K, Binenbaum I, Ricciardi W, et al. Benefits and challenges of Big Data in healthcare: an overview of the European initiatives. Eur J Public Health 2019;29(Supplement_3):23-27. DOI: 10.1093/eurpub/ckz168.
8. Park YT, Han D. Current status of electronic medical record systems in hospitals and clinics in Korea. Healthc Inform Res 2017;23(3):189-198. (Korean). DOI: 10.4258/hir.2017.23.3.189.
9. Park YT, Kim YS, Yi BK, Kim SM. Clinical decision support functions and digitalization of clinical documents of electronic medical record systems. Healthc Inform Res 2019;25(2):115-23. (Korean). DOI: 10.4258/hir.2019.25.2.115.
10. Seong SC, Kim YY, Khang YH, Heon Park J, Kang HJ, Lee H, et al. Data resource profile: the National Health Information Database of the National Health Insurance Service in South Korea. Int J Epidemiol 2017;46(3):799-800. DOI: 10.1093/ije/dyw253.
11. Kim HH, Kim B, Joo S, Shin SY, Cha HS, Park YR. Why do data users say health care data are difficult to use? A cross-sectional survey study. J Med Internet Res 2019;21(8):e14126. DOI: 10.2196/14126.
12. Yu HW, Choi JY, Park YS, Park HS, Choi Y, Ahn SH, et al. Implementation of a resident night float system in a surgery department in Korea for 6 months: electronic medical record-based big data analysis and medical staff survey. Ann Surg Treat Res 2019;96(5):209-215. DOI: 10.4174/astr.2019.96.5.209.
13. Willems SM, Abeln S, Feenstra KA. The potential use of big data in oncology. Oral Oncol 2019;98:8-12. DOI: 10.1016/j.oraloncology.2019.09.003.
14. Major A, Cox SM, Volchenboum SL. Using big data in pediatric oncology: current applications and future directions. Semin Oncol 2020;47(1):56-64. DOI: 10.1053/j.seminoncol.2020.02.006.
15. Schlick CJR, Castle JP, Bentrem DJ. Utilizing big data in cancer care. Surg Oncol Clin N Am 2018;27(4):641-652. DOI: 10.1016/j.soc.2018.05.005.
16. Barone L, Williams J, Micklos D. Unmet needs for analyzing biological big data: a survey of 704 NSF principal investigators. PLoS Comput Biol 2017;13(10):e1005755. DOI: 10.1371/journal.pcbi.1005755.
17. Dillman DA, Smyth JD, Christian LM. Internet, mail, and mixed-mode surveys: The tailored design method. 3rd ed. Hoboken, NJ: John Wiley & Sons; 2009.
18. Brennan PF, Bakken S. Nursing needs big data and big data needs nursing. J Nurs Scholarsh 2015;47(5):477-484. DOI: 10.1111/jnu.12159.
19. McNutt TR, Moore KL, Quon H. Needs and challenges for big data in radiation oncology. Int J Radiat Oncol Biol Phys 2016;95(3):909-915. DOI: 10.1016/j.ijrobp.2015.11.032.
20. Chen B, Butte AJ. Leveraging big data to transform target selection and drug discovery. Clin Pharmacol Ther 2016;99(3):285-297. DOI: 10.1002/cpt.318.
21. Bini SA. Artificial intelligence, machine learning, deep learning, and cognitive computing: What do these terms mean and how will they impact health care? J Arthroplasty 2018;33(8):2358-2361. DOI: 10.1016/j.arth.2018.02.067.
22. Capobianco E. Data-driven clinical decision processes: it's time. J Transl Med 2019;17(1):44. DOI: 10.1186/s12967-019-1795-5.
23. Dash S, Shakyawar SK, Sharma M, Kaushik S. Big data in healthcare: management, analysis and future prospects. J Big Data 2019;6:54. DOI: 10.1186/s40537-019-0217-0.
24. Ristevski B, Chen M. Big data analytics in medicine and healthcare. J Integr Bioinform 2018;15(3):20170030. DOI: 10.1515/jib-2017-0030.
25. Viceconti M, Hunter P, Hose R. Big data, big knowledge: big data for personalized healthcare. IEEE J Biomed Health Inform 2015;19(4):1209-1215. DOI: 10.1109/JBHI.2015.2406883.