The Intelligent Internet of Things (IIoT) involves real-world things that communicate or interact with each other through networking technologies by collecting data from these "things" and using intelligent approaches, such as Artificial Intelligence (AI) and machine learning, to make accurate decisions. Data science is the science of dealing with data and its relationships through intelligent approaches. Most state-of-the-art research focuses independently on either data science or IIoT, rather than exploring their integration. Therefore, to address the gap, this article provides a comprehensive survey on the advances and integration of data science with the Intelligent IoT (IIoT) system by classifying the existing IoT-based data science techniques and presenting a summary of various characteristics. The paper analyzes the data science or big data security and privacy features, including network architecture, data protection, and continuous monitoring of data, which face challenges in various IoT-based systems. Extensive insights into IoT data security, privacy, and challenges are visualized in the context of data science for IoT. In addition, this study reveals the current opportunities to enhance data science and IoT market development. The current gap and challenges faced in the integration of data science and IoT are comprehensively presented, followed by the future outlook and possible solutions.
Due to the recent explosion of big data, our society has been rapidly going through digital transformation and entering a new world with numerous eye-opening developments. These new trends impact society and future jobs, and thus student careers. At the heart of this digital transformation is data science, the discipline that makes sense of big data. With many rapidly emerging digital challenges ahead of us, this article discusses perspectives on iSchools' opportunities and suggestions in data science education. We argue that iSchools should empower their students with "information computing" disciplines, which we define as the ability to solve problems and create values, information, and knowledge using tools in application domains. As specific approaches to enforcing information computing disciplines in data science education, we suggest the three foci of user-based, tool-based, and application-based. These three foci will serve to differentiate the data science education of iSchools from that of computer science or business schools. We present a layered Data Science Education Framework (DSEF) with building blocks that include the three pillars of data science (people, technology, and data), computational thinking, data-driven paradigms, and data science lifecycles. Data science courses built on top of this framework should thus be executed with user-based, tool-based, and application-based approaches. This framework will help our students think about data science problems from the big picture perspective and foster appropriate problem-solving skills in conjunction with broad perspectives of data science lifecycles. We hope the DSEF discussed in this article will help fellow iSchools in their design of new data science curricula.
Purpose: The purpose of the paper is to provide a framework for addressing the disconnect between metadata and data science. Data science cannot progress without metadata research. This paper takes steps toward advancing the synergy between metadata and data science, and identifies pathways for developing a more cohesive metadata research agenda in data science. Design/methodology/approach: This paper identifies factors that challenge metadata research in the digital ecosystem, defines metadata and data science, and presents the concepts big metadata, smart metadata, and metadata capital as part of a metadata lingua franca connecting to data science. Findings: The "utilitarian nature" and "historical and traditional views" of metadata are identified as two intersecting factors that have inhibited metadata research. Big metadata, smart metadata, and metadata capital are presented as part of a metadata lingua franca to help frame research in the data science research space. Research limitations: There are additional, intersecting factors to consider that likely inhibit metadata research, and other significant metadata concepts to explore. Practical implications: The immediate contribution of this work is that it may elicit response, critique, revision, or, more significantly, motivate research. The work presented can encourage more researchers to consider the significance of metadata as a research-worthy topic within data science and the larger digital ecosystem. Originality/value: Although metadata research has not kept pace with other data science topics, there is little attention directed to this problem. This is surprising, given that metadata is essential for data science endeavors. This examination synthesizes original and prior scholarship to provide new grounding for metadata research in data science.
Since its launch in 2011, the Materials Genome Initiative (MGI) has drawn the attention of researchers from academia, government, and industry worldwide. As one of the three tools of the MGI, the use of materials data, for the first time, has emerged as an extremely significant approach in materials discovery. Data science has been applied in different disciplines as an interdisciplinary field to extract knowledge from data. The concept of materials data science has been utilized to demonstrate its application in materials science. To explore its potential as an active research branch in the big data era, a three-tier system has been put forward to define the infrastructure for the classification, curation and knowledge extraction of materials data.
In the present digital era, data science techniques exploit artificial intelligence (AI) to help those who start and run small and medium-sized enterprises (SMEs) have an impact and develop their businesses. Data science integrates the conventions of econometrics with the technological elements of data science. It makes use of machine learning (ML) and predictive and prescriptive analytics to effectively understand financial data and solve related problems. Smart technologies allow SMEs to get smarter with their processes and operate efficiently. At the same time, there is a need to develop an effective tool that can assist small and medium-sized enterprises in forecasting business failure as well as financial crisis. AI has become a familiar tool for several businesses because it concentrates on the design of intelligent decision-making tools to solve particular real-time problems. With this motivation, this paper presents a new AI-based optimal functional link neural network (FLNN) based financial crisis prediction (FCP) model for SMEs. The proposed model involves preprocessing, feature selection, classification, and parameter tuning. At the initial stage, the financial data of the enterprises are collected and preprocessed to enhance the quality of the data. Next, a novel chaotic grasshopper optimization algorithm (CGOA) based feature selection technique is applied for the optimal selection of features. Moreover, a functional link neural network (FLNN) model is employed for the classification of the feature-reduced data. Finally, the efficiency of the FLNN model is improved by the use of the cat swarm optimizer (CSO) algorithm. A detailed experimental validation process takes place on the Polish dataset to assess the performance of the presented model. The experimental studies demonstrated that the CGOA-FLNN-CSO model accomplished maximum prediction accuracies of 98.830%, 92.100%, and 95.220% on the applied Polish dataset Years I-III, respectively.
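The classification stage named in this abstract can be illustrated with a minimal functional link network. The sketch below is not the authors' CGOA-FLNN-CSO model: it assumes a trigonometric functional expansion (one common FLNN choice), replaces the cat swarm optimizer with plain gradient descent on a logistic unit, and omits feature selection entirely. The function names and the toy data are hypothetical.

```python
import math

def trig_expand(x, order=2):
    """Functional-link expansion: raw features plus sin/cos harmonics of each."""
    out = list(x)
    for k in range(1, order + 1):
        out += [math.sin(k * math.pi * v) for v in x]
        out += [math.cos(k * math.pi * v) for v in x]
    return out

def train_flnn(X, y, order=2, lr=0.5, epochs=2000):
    """Fit one logistic unit on the expanded features.

    Assumption: plain batch gradient descent stands in for the
    metaheuristic tuning described in the abstract.
    """
    Z = [trig_expand(x, order) + [1.0] for x in X]  # trailing 1.0 is the bias input
    w = [0.0] * len(Z[0])
    n = len(Z)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for z, t in zip(Z, y):
            p = 1.0 / (1.0 + math.exp(-sum(wi * zi for wi, zi in zip(w, z))))
            for j, zj in enumerate(z):
                grad[j] += (p - t) * zj / n
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w

def predict_flnn(w, x, order=2):
    """Return the predicted class (0 or 1) for one feature vector."""
    z = trig_expand(x, order) + [1.0]
    return int(sum(wi * zi for wi, zi in zip(w, z)) > 0)

# Toy data (made up): class 1 occupies a middle band of a single feature,
# which a linear unit on the raw feature alone could not separate.
X = [[0.0], [0.1], [0.4], [0.5], [0.6], [0.9], [1.0]]
y = [0, 0, 1, 1, 1, 0, 0]
w = train_flnn(X, y)
predictions = [predict_flnn(w, x) for x in X]
```

The expansion is what lets a single-layer network draw a nonlinear boundary: here the cos(2πx) harmonic is negative exactly on the middle band, so no hidden layer is required.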
There has long been discussion about the distinctions between library science, information science, and informatics, and how these areas differ from and overlap with computer science. Today the emerging term data science generates excitement and questions about how it relates to and differs from these other areas of study.
The rise or fall of the stock markets directly affects investors' interest and loyalty. Therefore, it is necessary to measure the performance of stocks in the market in advance to prevent our assets from suffering significant losses. In our proposed study, six supervised machine learning (ML) strategies and deep learning (DL) models with long short-term memory (LSTM) were deployed for thorough analysis and measurement of the performance of technology stocks. Under discussion are Apple Inc. (AAPL), Microsoft Corporation (MSFT), Broadcom Inc., Taiwan Semiconductor Manufacturing Company Limited (TSM), NVIDIA Corporation (NVDA), and Avigilon Corporation (AVGO). The datasets were taken from the Yahoo Finance API from 06-05-2005 to 06-05-2022 (seventeen years) with 4280 samples. As already noted, multiple studies have been performed to resolve this problem using linear regression, support vector machines, deep long short-term memory (LSTM), and many other models. In this research, the Hidden Markov Model (HMM) outperformed the other employed machine learning ensembles, tree-based models, the ARIMA (Auto Regressive Integrated Moving Average) model, and long short-term memory with a robust mean accuracy score of 99.98. Other statistical analyses and measurements for machine learning ensemble algorithms, the long short-term memory model, and ARIMA were also carried out for further investigation of the performance of advanced models for forecasting time series data. Thus, the proposed research found the best model to be HMM, and LSTM was the second-best model that performed well in all aspects. The developed model is recommended as helpful for early measurement of technology stock performance, supporting investment or withdrawal decisions based on the future stock rise or fall, for creating smart environments.
This paper reviews literature pertaining to the development of data science as a discipline, current issues with data bias and ethics, and the role that the discipline of information science may play in addressing these concerns. Information science research and researchers have much to offer for data science, owing to their background as transdisciplinary scholars who apply human-centered and social-behavioral perspectives to issues within natural science disciplines. Information science researchers have already contributed to a humanistic approach to data ethics within the literature, and an emphasis on data science within information schools all but ensures that this literature will continue to grow in coming decades. This review article serves as a reference for the history, current progress, and potential future directions of data ethics research within the corpus of information science literature.
In bioinformatics applications, examination of microarray data has received significant interest for diagnosing diseases. Microarray gene expression data are characterized by a massive search space, which poses a primary challenge in the appropriate selection of genes. Microarray data classification incorporates multiple disciplines such as bioinformatics, machine learning (ML), data science, and pattern classification. This paper designs an optimal deep neural network based microarray gene expression classification (ODNN-MGEC) model for bioinformatics applications. The proposed ODNN-MGEC technique performs a data normalization process to bring the data onto a uniform scale. Next, an improved fruit fly optimization (IFFO) based feature selection technique is used to reduce the high dimensionality of the biomedical data. Moreover, a deep neural network (DNN) model is applied for the classification of microarray gene expression data, and the hyperparameter tuning of the DNN model is carried out using the Symbiotic Organisms Search (SOS) algorithm. The utilization of the IFFO and SOS algorithms paves the way for accomplishing maximum gene expression classification outcomes. To examine the improved outcomes of the ODNN-MGEC technique, a wide-ranging experimental analysis is made against benchmark datasets. The extensive comparison study with recent approaches demonstrates the enhanced outcomes of the ODNN-MGEC technique in terms of different measures.
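The normalization step in this abstract can be sketched as follows. The abstract does not say which normalization ODNN-MGEC uses, so min-max scaling to [0, 1] is an assumption here, and the function name and sample values are hypothetical.

```python
def min_max_normalize(rows):
    """Rescale each column (gene) of a samples-by-genes table to [0, 1].

    Assumption: min-max scaling; a constant column is mapped to 0.0
    to avoid division by zero.
    """
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]

# Three samples, two genes (made-up expression values).
expression = [[1.0, 10.0], [3.0, 10.0], [2.0, 10.0]]
normalized = min_max_normalize(expression)
```

Scaling per gene rather than per sample keeps genes with large raw intensities from dominating a downstream feature selector or classifier.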
Introduction: Within the field of scientometrics, which involves quantitative studies of science, the citation analysis specialism counts citations between academic papers in order to help evaluate the impact of the cited work (Moed, 2006).
In this editorial, we comment on the current development and deployment of data science in intensive care units (ICUs). Data in ICUs can be classified into qualitative and quantitative data, with different technologies needed to translate and interpret them. Data science, in the form of artificial intelligence (AI), should find the right interaction between physicians, data and algorithm. For individual patients and physicians, sepsis and mechanical ventilation have been two important aspects where AI has been extensively studied. However, major risks of bias, lack of generalizability and poor clinical values remain. AI deployment in the ICUs should be emphasized more to facilitate AI development. For ICU management, AI has a huge potential in transforming resource allocation. The coronavirus disease 2019 pandemic has given opportunities to establish such systems, which should be investigated further. Ethical concerns must be addressed when designing such AI.
With the rapid development of data science and artificial intelligence technology, their application in the field of education has become extensive, which is of great significance for promoting educational equity. By collecting and analyzing students' data, personalized learning provides customized learning paths; the intelligent auxiliary education system provides personalized guidance to reduce the burden on teachers. This paper discusses the strategies of data science and artificial intelligence in promoting educational equity, including the establishment of a comprehensive student data collection and analysis system and the promotion of the intelligent auxiliary education system, aiming to realize the optimal allocation of educational resources so that every student can enjoy fair and high-quality education.
In order to conduct research and analysis on the construction of application-oriented undergraduate data science and big data technology courses, the professional development characteristics of universities and enterprises should be taken into consideration, the development trend of the big data industry should be scrutinized, and professional application-oriented talents should be cultivated in line with job requirements. This paper expounds the demand for capacity-building professional development in application-oriented undergraduate data science and big data technology courses, conducts research and analysis on the current situation of professional development, and puts forward strategies in the hope of providing a reference for capacity-building professional development.
The advent of the big data era gives rise to a novel discipline called Data Science. Data Science is interdisciplinary in nature, and the existing relevant studies can be categorized into domain-independent studies and domain-dependent studies. The domain-independent and domain-dependent studies are evolving into Domain-general Data Science (DGDS) and Domain-specific Data Science (DSDS), respectively. DGDS emphasizes Data Science in a general sense, involving concepts, theories, methods, technologies, and tools. DSDS is a variant of DGDS and varies from one domain to another. The most popular forms of DSDS include Data Journalism, Industrial Data Science, Business Data Science, Health Data Science, Biological Data Science, Social Data Science, and Agile Data Science. The difference between DGDS and DSDS is rooted in their thinking paradigms: DGDS conforms to data-centered thinking, while DSDS is in line with knowledge-centered thinking. As a result, DGDS focuses on theoretical studies, while DSDS is centered on applied ones. However, DSDS and DGDS possess complementary advantages. Theoretical Data Science (TDS) is a new branch of Data Science that employs mathematical models and abstractions of data objects and systems to rationalize, explain and predict big data phenomena. TDS contrasts with DSDS, which uses causal analysis, as well as with DGDS, which employs data-centered thinking to deal with big data problems, in that it balances the usability and the interpretability of Data Science practices. The main concerns of TDS are integrating data-centered thinking with knowledge-centered thinking and transforming correlation analysis into causal analysis. Hence, TDS can bridge the gap between DGDS and DSDS, and balance the usability and the interpretability of big data solutions. The studies of TDS should focus on the following research purposes: to develop theoretical studies of TDS, to take advantage of the active property of big data, to embrace design of experiments, to enhance causality analysis, and to develop data products.
This article presents views on the future development of data science, with a particular focus on its importance to artificial intelligence (AI). After discussing the challenges of data science, it elucidates a possible approach to tackle these challenges by clarifying the logic and principles of data related to the multi-level complexity of the world. Finally, urgently required actions are briefly outlined.
Purpose: Data science is the study of the generalizable extraction of knowledge from data. It includes a variety of components and builds on methods and concepts from many domains, including mathematics, probability models, machine learning, statistical learning, computer programming, data engineering, pattern recognition and learning, visualization and data warehousing, aiming to extract value from data. The purpose of this paper is to provide an overview of open source (OS) data science tools, proposing a classification scheme that can be used to study OS data science software. Design/methodology/approach: The proposed classification scheme is based on general characteristics, project activity, operational characteristics and data mining characteristics. The authors then use the proposed scheme to examine 70 identified open source software tools. From this the authors provide insight about the current status of OS data science tools and reveal the state-of-the-art tools. Findings: The features of 70 OS tools are recorded based on the criteria of the four groups of characteristics: general characteristics, project activity, operational characteristics and data mining characteristics. Interesting results came from the analysis of these features and are recorded here. Originality/value: The contribution of this survey is the development of a new classification scheme for the examination and study of OS data science tools. In parallel, this study provides an overview of existing OS data science tools.
The digital transformation of our society, coupled with the increasing exploitation of natural resources, makes sustainability challenges more complex and dynamic than ever before. These changes are unlikely to stop or even decelerate in the near future. There is an urgent need for a new scientific approach and an advanced form of evidence-based decision-making for the benefit of society, the economy, and the environment. To understand the impacts and interrelationships between humans as a society and natural Earth system processes, we propose a new engineering discipline, Big Earth Data science. This science is called to provide the methodologies and tools to generate knowledge from diverse, numerous, and complex data sources necessary to ensure a sustainable human society essential for the preservation of planet Earth. Big Earth Data science aims at utilizing data from Earth observation and social sensing and developing theories for understanding the mechanisms of how such a social-physical system operates and evolves. The manuscript introduces the universe of discourse characterizing this new science, its foundational paradigms and methodologies, and a possible technological framework to be implemented by applying an ecosystem approach. CASEarth and GEOSS are presented as examples of international implementation attempts. Conclusions discuss important challenges and collaboration opportunities.
Big data has attracted much attention from academia and industry. But the discussion of big data is disparate, fragmented and distributed among different outlets. This paper conducts a systematic and extensive review of 186 journal publications about big data from 2011 to 2015 in the Science Citation Index (SCI) and the Social Science Citation Index (SSCI) database, aiming to provide scholars and practitioners with a comprehensive overview and big picture of research on big data. The selected papers are grouped into 20 research categories. The contents of the paper(s) in each research category are summarized. Research directions for each category are outlined as well. The results in this study indicate that the selected papers were mainly published between 2013 and 2015 and focus on technological issues regarding big data. Diverse new approaches, methods, frameworks and systems are proposed for data collection, storage, transport, processing and analysis in the selected papers. Possible directions for future research on big data are discussed.
In response to public scrutiny of data-driven algorithms, the field of data science has adopted ethics training and principles. Although ethics can help data scientists reflect on certain normative aspects of their work, such efforts are ill-equipped to generate a data science that avoids social harms and promotes social justice. In this article, I argue that data science must embrace a political orientation. Data scientists must recognize themselves as political actors engaged in normative constructions of society and evaluate their work according to its downstream impacts on people's lives. I first articulate why data scientists must recognize themselves as political actors. In this section, I respond to three arguments that data scientists commonly invoke when challenged to take political positions regarding their work. In confronting these arguments, I describe why attempting to remain apolitical is itself a political stance (a fundamentally conservative one) and why data science's attempts to promote "social good" dangerously rely on unarticulated and incrementalist political assumptions. I then propose a framework for how data science can evolve toward a deliberative and rigorous politics of social justice. I conceptualize the process of developing a politically engaged data science as a sequence of four stages. Pursuing these new approaches will empower data scientists with new methods for thoughtfully and rigorously contributing to social justice.
Funding (survey on data science and the Intelligent IoT): Supported in part by the National Natural Science Foundation of China under Grant 62371181; in part by the Changzhou Science and Technology International Cooperation Program under Grant CZ20230029; by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (2021R1A2B5B02087169); and under the framework of the international cooperation program managed by the National Research Foundation of Korea (2022K2A9A1A01098051).
Funding: Project supported by the National Key R&D Program of China (Grant No. 2016YFB0700503), the National High Technology Research and Development Program of China (Grant No. 2015AA03420), the Beijing Municipal Science and Technology Project, China (Grant No. D161100002416001), the National Natural Science Foundation of China (Grant No. 51172018), and Kennametal Inc.
Abstract: Since its launch in 2011, the Materials Genome Initiative (MGI) has drawn the attention of researchers from academia, government, and industry worldwide. As one of the three tools of the MGI, the use of materials data, for the first time, has emerged as an extremely significant approach in materials discovery. Data science has been applied in different disciplines as an interdisciplinary field to extract knowledge from data. The concept of materials data science has been utilized to demonstrate its application in materials science. To explore its potential as an active research branch in the big data era, a three-tier system has been put forward to define the infrastructure for the classification, curation, and knowledge extraction of materials data.
Funding: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work under Grant Number (RGP 1/147/42), www.kku.edu.sa. This research was funded by the Deanship of Scientific Research at Princess Nourah bint Abdulrahman University through the Fast-Track Path of Research Funding Program.
Abstract: In the present digital era, data science techniques exploit artificial intelligence (AI) to help those who start and run small and medium-sized enterprises (SMEs) have an impact and develop their businesses. Data science integrates the conventions of econometrics with the technological elements of data science. It makes use of machine learning (ML) and predictive and prescriptive analytics to effectively understand financial data and solve related problems. Smart technologies enable SMEs to make their processes smarter and to operate efficiently. At the same time, there is a need for an effective tool that can assist small to medium-sized enterprises in forecasting business failure as well as financial crisis. AI has become a familiar tool for many businesses because it concentrates on the design of intelligent decision-making tools to solve particular real-time problems. With this motivation, this paper presents a new AI-based optimal functional link neural network (FLNN) based financial crisis prediction (FCP) model for SMEs. The proposed model involves preprocessing, feature selection, classification, and parameter tuning. At the initial stage, the financial data of the enterprises are collected and preprocessed to enhance the quality of the data. Next, a novel chaotic grasshopper optimization algorithm (CGOA) based feature selection technique is applied for the optimal selection of features. Then, the FLNN model is employed for the classification of the feature-reduced data. Finally, the efficiency of the FLNN model is improved by the use of the cat swarm optimizer (CSO) algorithm. A detailed experimental validation is carried out on the Polish dataset to assess the performance of the presented model. The experimental studies demonstrated that the CGOA-FLNN-CSO model accomplished maximum prediction accuracies of 98.830%, 92.100%, and 95.220% on the applied Polish dataset Year I-III, respectively.
Abstract: There has long been discussion about the distinctions among library science, information science, and informatics, and how these areas differ from and overlap with computer science. Today the term data science is emerging, generating excitement and questions about how it relates to and differs from these other areas of study.
Funding: Supported by Kyungpook National University Research Fund, 2020.
Abstract: The rise or fall of the stock markets directly affects investors' interest and loyalty. Therefore, it is necessary to measure the performance of stocks in the market in advance to prevent our assets from suffering significant losses. In our proposed study, six supervised machine learning (ML) strategies and deep learning (DL) models with long short-term memory (LSTM) from data science were deployed for thorough analysis and measurement of the performance of technology stocks. The stocks under discussion are Apple Inc. (AAPL), Microsoft Corporation (MSFT), Broadcom Inc., Taiwan Semiconductor Manufacturing Company Limited (TSM), NVIDIA Corporation (NVDA), and Avigilon Corporation (AVGO). The datasets were taken from the Yahoo Finance API from 06-05-2005 to 06-05-2022 (seventeen years) with 4280 samples. As already noted, multiple studies have been performed to resolve this problem using linear regression, support vector machines, deep long short-term memory (LSTM), and many other models. In this research, the Hidden Markov Model (HMM) outperformed the other employed machine learning ensembles, tree-based models, the ARIMA (Auto Regressive Integrated Moving Average) model, and long short-term memory, with a robust mean accuracy score of 99.98. Other statistical analyses and measurements for machine learning ensemble algorithms, the LSTM model, and ARIMA were also carried out for further investigation of the performance of advanced models for forecasting time series data. Thus, the proposed research found the best model to be HMM, and LSTM was the second-best model, performing well in all aspects. The developed model is recommended as a helpful tool for early measurement of technology stock performance, supporting investment or withdrawal decisions based on the future rise or fall of a stock, for creating smart environments.
Abstract: This paper reviews literature pertaining to the development of data science as a discipline, current issues with data bias and ethics, and the role that the discipline of information science may play in addressing these concerns. Information science research and researchers have much to offer data science, owing to their background as transdisciplinary scholars who apply human-centered and social-behavioral perspectives to issues within natural science disciplines. Information science researchers have already contributed to a humanistic approach to data ethics within the literature, and an emphasis on data science within information schools all but ensures that this literature will continue to grow in coming decades. This review article serves as a reference for the history, current progress, and potential future directions of data ethics research within the corpus of information science literature.
Funding: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work under grant number (RGP 2/42/43). This work was supported by Taif University Researchers Supporting Program (project number: TURSP-2020/200), Taif University, Saudi Arabia.
Abstract: In bioinformatics applications, the examination of microarray data for diagnosing diseases has received significant interest. Microarray gene expression data are characterized by a massive search space, which poses a primary challenge for the appropriate selection of genes. Microarray data classification incorporates multiple disciplines such as bioinformatics, machine learning (ML), data science, and pattern classification. This paper designs an optimal deep neural network based microarray gene expression classification (ODNN-MGEC) model for bioinformatics applications. The proposed ODNN-MGEC technique performs a data normalization process to bring the data onto a uniform scale. In addition, an improved fruit fly optimization (IFFO) based feature selection technique is used to reduce the high dimensionality of the biomedical data. Moreover, a deep neural network (DNN) model is applied for the classification of microarray gene expression data, and the hyperparameter tuning of the DNN model is carried out using the Symbiotic Organisms Search (SOS) algorithm. The utilization of the IFFO and SOS algorithms paves the way for accomplishing maximum gene expression classification outcomes. To examine the improved outcomes of the ODNN-MGEC technique, a wide-ranging experimental analysis is made against benchmark datasets. The extensive comparison study with recent approaches demonstrates the enhanced outcomes of the ODNN-MGEC technique in terms of different measures.
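The normalization step mentioned in the abstract above can be pictured with a small sketch. The abstract does not say which scaling the authors use, so min-max scaling to [0, 1] is an assumption here, and the function name `min_max_normalize` and the toy expression matrix are hypothetical.

```python
from typing import List


def min_max_normalize(matrix: List[List[float]]) -> List[List[float]]:
    """Rescale each column (gene) of a samples-by-genes matrix to [0, 1].

    Assumption: min-max scaling; the source abstract only says the data
    are normalized "into a uniform scale".
    """
    if not matrix:
        return []
    columns = list(zip(*matrix))  # transpose: one tuple per gene
    scaled_columns = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column has no spread; map it to 0.0 to avoid dividing by zero.
        scaled_columns.append([0.0 if span == 0 else (v - lo) / span for v in col])
    # Transpose back to samples-by-genes.
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical toy data: 3 samples (rows) x 3 genes (columns).
expression = [
    [2.0, 10.0, 5.0],
    [4.0, 30.0, 5.0],
    [6.0, 20.0, 5.0],
]
normalized = min_max_normalize(expression)
```

After scaling, every gene contributes values in the same range, so genes measured on large numeric scales do not dominate a later feature selection or classification stage.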
Abstract: Introduction. Within the field of scientometrics, which involves quantitative studies of science, the citation analysis specialism counts citations between academic papers in order to help evaluate the impact of the cited work (Moed, 2006).
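As a minimal illustration of the citation counting described above, the sketch below tallies how often each paper is cited in a list of (citing, cited) pairs. The paper identifiers and the helper name `count_citations` are hypothetical, and real citation analysis involves considerably more than raw counts.

```python
from collections import Counter
from typing import Iterable, Tuple


def count_citations(pairs: Iterable[Tuple[str, str]]) -> Counter:
    """Count how many times each paper appears as the cited work."""
    return Counter(cited for _citing, cited in pairs)


# Hypothetical citation links: (citing paper, cited paper).
citations = [
    ("P2", "P1"),
    ("P3", "P1"),
    ("P3", "P2"),
    ("P4", "P1"),
]
counts = count_citations(citations)
most_cited = counts.most_common(1)  # list with the single most-cited paper and its count
```

A `Counter` returns 0 for papers that were never cited, which keeps lookups simple for papers that only appear on the citing side.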
Abstract: In this editorial, we comment on the current development and deployment of data science in intensive care units (ICUs). Data in ICUs can be classified into qualitative and quantitative data, with different technologies needed to translate and interpret them. Data science, in the form of artificial intelligence (AI), should find the right interaction between physicians, data, and algorithms. For individual patients and physicians, sepsis and mechanical ventilation have been two important aspects where AI has been extensively studied. However, major risks of bias, lack of generalizability, and poor clinical value remain. AI deployment in ICUs should be emphasized more to facilitate AI development. For ICU management, AI has huge potential in transforming resource allocation. The coronavirus disease 2019 pandemic has given opportunities to establish such systems, which should be investigated further. Ethical concerns must be addressed when designing such AI.
Abstract: With the rapid development of data science and artificial intelligence technology, their application in the field of education has become extensive, which is of great significance for promoting educational equity. By collecting and analyzing students' data, personalized learning provides customized learning paths; the intelligent auxiliary education system provides personalized guidance to reduce the burden on teachers. This paper discusses strategies for using data science and artificial intelligence to promote educational equity, including the establishment of a comprehensive student data collection and analysis system and the promotion of intelligent auxiliary education systems, aiming to realize the optimal allocation of educational resources so that every student can enjoy fair and high-quality education.
Abstract: In order to conduct research and analysis on the construction of application-oriented undergraduate data science and big data technology courses, the professional development characteristics of universities and enterprises should be taken into consideration, the development trend of the big data industry should be scrutinized, and professional application-oriented talents should be cultivated in line with job requirements. This paper expounds the demand for capacity-building professional development in application-oriented undergraduate data science and big data technology courses, conducts research and analysis on the current situation of professional development, and puts forward strategies in the hope of providing a reference for capacity-building professional development.
Funding: Ministry of Education Humanities and Social Science project (Project No. 20YJA870003).
Abstract: The advent of the big data era has given rise to a novel discipline called Data Science. Data Science is interdisciplinary in nature, and the existing relevant studies can be categorized into domain-independent studies and domain-dependent studies. The domain-independent and domain-dependent studies are evolving into Domain-general Data Science (DGDS) and Domain-specific Data Science (DSDS), respectively. DGDS emphasizes Data Science in a general sense, involving concepts, theories, methods, technologies, and tools. DSDS is a variant of DGDS and varies from one domain to another. The most popular forms of DSDS include Data Journalism, Industrial Data Science, Business Data Science, Health Data Science, Biological Data Science, Social Data Science, and Agile Data Science. The difference between DGDS and DSDS is rooted in their thinking paradigms: DGDS conforms to data-centered thinking, while DSDS is in line with knowledge-centered thinking. As a result, DGDS focuses on theoretical studies, while DSDS is centered on applied ones. However, DSDS and DGDS possess complementary advantages. Theoretical Data Science (TDS) is a new branch of Data Science that employs mathematical models and abstractions of data objects and systems to rationalize, explain, and predict big data phenomena. TDS contrasts with DSDS, which uses causal analysis, as well as with DGDS, which employs data-centered thinking to deal with big data problems, in that it balances the usability and the interpretability of Data Science practices. The main concerns of TDS are integrating data-centered thinking with knowledge-centered thinking and transforming correlation analysis into causal analysis. Hence, TDS can bridge the gap between DGDS and DSDS and balance the usability and the interpretability of big data solutions. Studies of TDS should focus on the following research purposes: to develop theoretical studies of TDS, to take advantage of the active property of big data, to embrace design of experiments, to enhance causality analysis, and to develop data products.
Abstract: This article presents views on the future development of data science, with a particular focus on its importance to artificial intelligence (AI). After discussing the challenges of data science, it elucidates a possible approach to tackle these challenges by clarifying the logic and principles of data related to the multi-level complexity of the world. Finally, urgently required actions are briefly outlined.
Funding: The research leading to the results presented in this paper has received funding from the European Union Seventh Framework Programme (FP7-2012-NMP-ICT-FoF) under Grant Agreement No. 314364.
Abstract: Purpose: Data science is the study of the generalizable extraction of knowledge from data. It includes a variety of components and builds on methods and concepts from many domains, including mathematics, probability models, machine learning, statistical learning, computer programming, data engineering, pattern recognition and learning, visualization, and data warehousing, with the aim of extracting value from data. The purpose of this paper is to provide an overview of open source (OS) data science tools, proposing a classification scheme that can be used to study OS data science software. Design/methodology/approach: The proposed classification scheme is based on general characteristics, project activity, operational characteristics, and data mining characteristics. The authors then use the proposed scheme to examine 70 identified open source software tools. From this, the authors provide insight into the current status of OS data science tools and reveal the state-of-the-art tools. Findings: The features of the 70 OS tools are recorded based on the criteria of the four groups of characteristics: general characteristics, project activity, operational characteristics, and data mining characteristics. Interesting results came from the analysis of these features and are recorded here. Originality/value: The contribution of this survey is the development of a new classification scheme for the examination and study of OS data science tools. In parallel, this study provides an overview of existing OS data science tools.
Funding: The Strategic Priority Research Program of the Chinese Academy of Sciences (grant numbers XDA19030000 and XDA19090000) and the DG Research and Innovation of the European Commission (H2020 grant number 34538).
Abstract: The digital transformation of our society, coupled with the increasing exploitation of natural resources, makes sustainability challenges more complex and dynamic than ever before. These changes are unlikely to stop or even decelerate in the near future. There is an urgent need for a new scientific approach and an advanced form of evidence-based decision-making for the benefit of society, the economy, and the environment. To understand the impacts and interrelationships between humans as a society and natural Earth system processes, we propose a new engineering discipline, Big Earth Data science. This science is called upon to provide the methodologies and tools to generate knowledge from the diverse, numerous, and complex data sources necessary to ensure a sustainable human society, which is essential for the preservation of planet Earth. Big Earth Data science aims to utilize data from Earth observation and social sensing and to develop theories for understanding the mechanisms by which such a social-physical system operates and evolves. The manuscript introduces the universe of discourse characterizing this new science, its foundational paradigms and methodologies, and a possible technological framework to be implemented by applying an ecosystem approach. CASEarth and GEOSS are presented as examples of international implementation attempts. The conclusions discuss important challenges and collaboration opportunities.
Abstract: Big data has attracted much attention from academia and industry, but the discussion of big data is disparate, fragmented, and distributed among different outlets. This paper conducts a systematic and extensive review of 186 journal publications about big data from 2011 to 2015 in the Science Citation Index (SCI) and the Social Science Citation Index (SSCI) databases, aiming to provide scholars and practitioners with a comprehensive overview and big picture of research on big data. The selected papers are grouped into 20 research categories. The contents of the paper(s) in each research category are summarized. Research directions for each category are outlined as well. The results of this study indicate that the selected papers were mainly published between 2013 and 2015 and focus on technological issues regarding big data. Diverse new approaches, methods, frameworks, and systems are proposed for data collection, storage, transport, processing, and analysis in the selected papers. Possible directions for future research on big data are discussed.
Abstract: In response to public scrutiny of data-driven algorithms, the field of data science has adopted ethics training and principles. Although ethics can help data scientists reflect on certain normative aspects of their work, such efforts are ill-equipped to generate a data science that avoids social harms and promotes social justice. In this article, I argue that data science must embrace a political orientation. Data scientists must recognize themselves as political actors engaged in normative constructions of society and evaluate their work according to its downstream impacts on people's lives. I first articulate why data scientists must recognize themselves as political actors. In this section, I respond to three arguments that data scientists commonly invoke when challenged to take political positions regarding their work. In confronting these arguments, I describe why attempting to remain apolitical is itself a political stance (a fundamentally conservative one) and why data science's attempts to promote "social good" dangerously rely on unarticulated and incrementalist political assumptions. I then propose a framework for how data science can evolve toward a deliberative and rigorous politics of social justice. I conceptualize the process of developing a politically engaged data science as a sequence of four stages. Pursuing these new approaches will empower data scientists with new methods for thoughtfully and rigorously contributing to social justice.