Plenty to read!

Sci to BI - Similarities between Science & Business Intelligence


The below article was written in 2019. I’ve copied and republished it here, to support a new article I’m writing about how Power BI can bring value to academic research.

“I USED TO WORK WITH FLIES….”

It’s a statement that I sometimes make when joking about how different my previous and current jobs are. Of course, I didn’t just work with flies; this is just a fun, trivial way of summarizing what I used to do. Until September of 2018, I was immersed in my PhD studies in Biomedical Sciences, where I used the fruit fly as a genetic model to better understand how insulin is regulated. It seems like a far cry from what I do now, working in Business Intelligence (BI) as a Data Visualization Consultant at Ordina. From Science to Business… Academia to Industry… Biomedicine to IT. When contrasted as such, it seems like I made a professional 180, right? In reality, nothing could be further from the truth. The world of BI is far closer to the world of science and academic research than one would think. From the skills and technologies employed to solve problems to the challenges that are faced, these two worlds are not only similar, but could also learn from one another in the ways that they are different. Because of that, I’m convinced that a career in BI is not only viable but also a tremendously rewarding and satisfying option for someone to pursue after concluding their PhD studies or postdoctoral work. At its core, the bread and butter of both science and BI is the same – it is all about the data.

Me, on the left, with a lot less hair, doin some researchin’


WHAT IS SCIENTIFIC RESEARCH?

A simplified scientific cycle, with the elements that set it apart from Business Intelligence in green (experimental design & scientific publications).

“How does that work?” There’s something - this thing – studied by these researchers… and you find it interesting, too. “Maybe like this?” You read and read – then read some more. Some answers arise, but often more questions, as your ideas iterate to a refined point of action. So how does it work? Well, now you’re going to find out. You and your supervisor… and a few other stakeholders. You design some experiments to capture data, and consolidate it with the data your lab already has. Most of the experiments fail or produce inconclusive results, nothing goes as planned, but you iterate some more. Eventually, from the data you’ve collected, critically appraised, analyzed and visualized… some answers start to surface. But like before, there are often more questions than answers. You gather the results and present them to your research team, peers and other stakeholders. Based on their feedback, you refine your ideas, design some more experiments… and the process repeats. Over and over. The details in-between vary, but there’s plenty of impostor syndrome and self-doubt in there, sprinkled with a few euphoric Eureka moments and seminars with free food to keep you going. Eventually, you reach a sufficient point at which your work is packaged and delivered into a story, told by your data.

In a nutshell, this is Scientific Research; this is a PhD. This is science. In my eyes, at least. It’s an iterative process, driven by testable questions and ideas for which we collect and analyze data in an attempt to answer them. There is a wide array of tools, methods and technologies used depending on the research problem. No one is an expert in all of them, and the technology changes rapidly, but in the knowledge arms race we can quickly learn and use these things effectively. Importantly, during this process, we also have to communicate the data and results. We have to regularly share them with our team and organization, so we can accelerate each other’s progress; with external peers, to share knowledge and critically appraise methods, techniques and results; and with stakeholders such as our evaluators, funding agencies, and the public, to remain accountable professionals. Once communicated, the results are then used to make decisions, such as the subsequent research directions of our own or peer groups, or even the development of industrial or societally relevant materials and regulations.


WHAT IS BUSINESS INTELLIGENCE?

A simplified Business Intelligence cycle, with the elements that set it apart from Scientific Research in green.

“How is our organization doing?” The company is expanding, processes and metrics are getting more complex, and it’s becoming harder and harder to stay on top of the information. Decisions need to be made, but the right data are not available. “What do we need to do?” they ask. So, now, you’re going to find out. You read, you engage the employees and you research the organization and its technologies. Your team designs an approach to capture data that doesn’t exist yet, consolidating and connecting it with the large volumes coming in from the organization’s other systems. Upon execution, from the data your team has collected, critically appraised, analyzed and visualized… a solution starts to surface. But this often leads to new questions or reveals other problems. So it’s presented to peers and stakeholders. Based on their feedback, you refine your ideas, iterate the prototype and continue to improve. Over and over. There’s plenty of politics and expectation management in there, sprinkled with regular deliveries and plenty of Eureka moments and meetings with free coffee to keep you going. Eventually, the solution is packaged and delivered. It’s capable of getting the right data to the organization in an effective way, and the insights your team has found help them take action and make decisions, providing value in the form of saved time and money.

In a nutshell, this is BI. In my eyes, at least. It’s an iterative process, driven by a need for data-driven decision making and business problems that we try to solve by collecting, managing and analyzing data. There is a wide array of tools, methods and technologies used depending on the business problem and desired technology. No one is an expert in all of them, and the technology changes rapidly, but in the knowledge arms race we can quickly learn and use these things effectively. Importantly, during this process, we also have to communicate the data and results. We have to regularly share them with our team, so we can accelerate each other’s progress; with peers in our organization, to share knowledge and critically appraise methods, techniques and results; and with stakeholders such as business and IT, to remain accountable professionals. Once communicated, the results are then used to make decisions, such as the strategic or operational direction of the business, or to help the business monitor the day-to-day situation.

HOW ARE BI & SCIENCE SIMILAR?

While a simplification of both industries, the above summaries provide a number of high-level parallels between BI and Scientific Research. Narratively speaking, there were sufficient similarities that I managed to save myself some time with copy-and-paste. But in addition to that, there are more detailed similarities between the industries that can be discussed. At the foundation of both BI and Science is data: the bread and butter of either industry. Given this common backbone, examining the data lifecycle or pipeline reveals that they follow the same flow, and differ only in approach, tradition or execution:

Data has the same lifecycle in both BI and Science. First, data is captured or collected, before it must be managed in some way. From here, the collected data must be prepared for analysis by transformation, consolidation and modelling. Analytics performed on the data can then provide key answers to questions and problems together with graph and chart visuals. Once made, these must be distributed, while the data and visuals thereafter must be appropriately governed and maintained.

As an example, there are two areas that are most similar between BI and Science: Data Analysis and Data Exploration/Visualization. This is because they should employ identical best practices and techniques to achieve the same result: interpretable and reliable conclusions from the data that are concisely and effectively communicated in a visual way. To do so, similar tools are used, such as R and Python, but distinct tools also exist, such as BI software like Power BI and scientific imaging software like FIJI. A data analyst in a business is trying to accomplish the same thing as a research scientist when they are analyzing or communicating data. This, too, makes apparent the overlap in skills that experts in either profession have: a critical and analytical mindset, combined with the ability to communicate data effectively. This isn’t to say that there aren’t obvious differences between BI and scientific research in these domains, though. In science, analysis typically seeks to identify a statistically significant difference between two groups, known as hypothesis testing, while in BI, analysis ranges from gross comparisons and trends to forecasting and statistical modelling.
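To make that last contrast a bit more concrete, here is a minimal Python sketch of the two styles of question, using only the standard library. Everything in it is an assumption for illustration: the numbers are made up, and the helper names `welch_t` and `trend_slope` are hypothetical. It computes a Welch-style t-statistic for a two-group comparison (the science-flavoured question) and a simple least-squares slope for a monthly trend (the BI-flavoured question). It stops short of a p-value or a forecast, which in practice you would get from a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t-statistic for the difference between two group means."""
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


def trend_slope(values):
    """Least-squares slope of values against their position (0, 1, 2, ...)."""
    xs = range(len(values))
    x_bar, y_bar = mean(xs), mean(values)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


# Made-up numbers, for illustration only.
control = [4.1, 3.9, 4.3, 4.0, 4.2]        # e.g. a measurement in control flies
mutant = [5.0, 5.2, 4.8, 5.1, 4.9]         # the same measurement in a mutant line
monthly_sales = [100, 110, 120, 130, 140]  # e.g. sales per month

# Science-style question: do the two groups differ?
t_stat = welch_t(mutant, control)

# BI-style question: which way is the metric trending, and how fast?
slope = trend_slope(monthly_sales)

print(f"t-statistic: {t_stat:.2f}")
print(f"trend: {slope:+.1f} per month")
```

The data handling is the same in both cases; only the question asked of the data changes.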

In this data pipeline, BI and science differ most in the management and distribution of data. In BI, the concept of Data Warehousing is essential for data management. In fact, Data Warehousing lies at the core of BI as a whole. Data Warehousing is the organization and storage of data in an architecture that makes it easier to use and maintain, and that, with good data models, increases the power and performance of the analytics and tools using that data. If that’s not clear for readers from the science world, this is equivalent to having a lab book, organized and labelled reagent boxes and microscope slides. If those things didn’t exist, your samples – and therefore your data – would be harder to trace, use, and rely on. The same goes for a Data Warehouse.
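For readers who have never seen one, the idea can be sketched in a few lines of Python with the built-in `sqlite3` module. This is a toy, not a real warehouse: I am assuming a simple star-style layout (one table of measurements, one table of labels describing them), and the table names, column names and values are all made up for illustration.

```python
import sqlite3

# An in-memory database standing in for the warehouse (illustration only).
conn = sqlite3.connect(":memory:")

conn.executescript("""
    -- The "labels": one row per product, described once.
    CREATE TABLE dim_product (
        product_id INTEGER PRIMARY KEY,
        category   TEXT
    );

    -- The "measurements": one row per sale, pointing at its label.
    CREATE TABLE fact_sales (
        sale_id    INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES dim_product(product_id),
        amount     REAL
    );

    INSERT INTO dim_product VALUES (1, 'Hardware'), (2, 'Software');
    INSERT INTO fact_sales VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 200.0);
""")

# Because every sale is traceable to a labelled product,
# a question like "sales per category" becomes a single query.
totals = conn.execute("""
    SELECT d.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_id = f.product_id
    GROUP BY d.category
    ORDER BY d.category
""").fetchall()

for category, total in totals:
    print(category, total)
```

In the lab analogy, `dim_product` plays the role of the labels on the boxes and `fact_sales` the role of the samples inside them.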

This is particularly important in large organizations that rely on such organized data infrastructures to ensure that their data is handled properly and effectively. In contrast, once in digital format, scientific data is rarely organized into such large architectures. Instead, scientific data is often kept as the original output files, unconsolidated and stored on local machines, on-premises servers, or basic cloud storage platforms like OneDrive or Google Drive. This is arguably because BI deals with larger collective data “hubs” in organizations, whereas in scientific research, the data is not typically consolidated above a few research groups or, at most, a consortium of researchers. As the volumes and diversity of data used in research continue to grow, however, the needs of science will evolve to be more similar to those of BI, whereupon Data Warehousing might provide a useful solution to an upcoming problem.

In data distribution & publication, BI and science differ in that, in BI, the data is typically used by business or technical users in interactive reports. These reports need to be refreshed on a regular basis (or might even be kept up to date in real time) to help users run their business in a data-driven way. Thus, the data consumption rate is higher, as the data needs to be used more than once, often on a daily basis. The architecture and technology underlying these reports also reflect this need, requiring performant and stable connections to good data models. In contrast, data in science is consumed via static figures that complement either oral presentations or written scientific articles discussing that data. Here, the data is not provided with a dynamic connection or a specific availability, only as a reproduced copy of the final result. Thus, the data is consumed ad hoc to produce the final result. Similarly, the tradition in science is for the data to be split between multiple figures that – in sequence – tell the data story. This differs from modern BI tools, which provide more data in a single view that can be filtered, drilled into or otherwise interacted with to provide on-demand information depending on what the user wants to see. As datasets continue to expand in size and scope in science, however, such interactive visualizations and higher data consumption rates are starting to become more commonplace, even if the underlying BI-like architecture is not.

Thus, data in BI and in scientific research follows the same pipeline. While they are highly similar in how the data is analyzed and visualized, they differ in how the data is managed and distributed. Despite this, the two industries are rivers meandering down the same mountain, following a parallel trajectory to the same goal – data-driven decision-making in a digitally accelerating world.

The value Power BI could bring to Academic Research
