The End Of Theory (Anderson):
In his article, Chris Anderson explains the importance of using Big Data to find correlations between different aspects of life. He argues that the scientific method is obsolete and that we no longer need to care about why things happen, saying that “Correlation is enough.” His argument is compelling because of how effectively petabytes of data can “let statistical algorithms find patterns where (traditional) science cannot.” With such vast amounts of data, we can advance areas of science at a much faster rate, just as, in Anderson’s example, J. Craig Venter has advanced the field of biology. I agree with Anderson that this methodology lets us use our time far more efficiently and tackle problems that were previously too difficult to approach. However, I think he is too eager to abandon the search for why things happen. Even when we have a correlation between two variables, people still need to understand the cause behind it in order to solve problems more effectively.
Big Data Epistemologies and Paradigm Shifts (Kitchin):
The beginning of Rob Kitchin’s research article highlights how extensive Big Data has become and how, more recently, machine learning can “computationally and automatically mine and detect patterns and build predictive models and optimize outcomes.” Because massive amounts of data can reveal truths about our world without a theory being tested first, some people consider this a new paradigm of science. Kitchin cites Jim Gray as saying that this new paradigm is an extension of the scientific method, while others believe it no longer requires theory to reveal truths and is therefore not connected to the scientific method. This new way of researching is undoubtedly effective, as shown by the example of the retail store (on page 4 of the article) that increased its revenue by 16% in the first month by applying machine learning to 12 years’ worth of transaction data.
However, Kitchin makes sure to highlight that some of the appeal of Big Data rests on fallacies. First, he says that there is sampling bias in how the data are collected, and that the systems used for collecting data were themselves created using scientific reasoning and testing. So while the new information and correlations gathered from Big Data might seem removed from the traditional scientific method, they actually come from algorithms which were “scientifically tested for validity,” according to Kitchin. Second, he rejects the claim that results from Big Data are objective and free from bias: data results are always framed by humans, so there will naturally be inherent bias from the presenter. Third, the results of Big Data cannot be examined proficiently by just anyone who can read the code, because subject experts are better at understanding what the results mean. Lastly, as is famously said in psychology, correlation is not causation. While correlations are useful in advertising and business, in science they can be “random in nature and have no or little causal association.”
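To illustrate the last point, here is a minimal Python sketch of my own (it is not from Kitchin’s article). It assumes a made-up data set of 200 variables that are pure random noise, with only 10 observations each, and looks for the strongest correlation between any two of them. The function names, the number of variables, and the seed are all my own choices for illustration.

```python
import itertools
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def strongest_chance_correlation(num_variables=200, num_observations=10, seed=0):
    """Largest absolute correlation among unrelated random variables.

    Every variable is independent noise, so any strong correlation
    found here exists only by chance (hypothetical setup for illustration).
    """
    rng = random.Random(seed)
    columns = [
        [rng.gauss(0, 1) for _ in range(num_observations)]
        for _ in range(num_variables)
    ]
    # Compare every pair of variables and keep the strongest relationship.
    return max(
        abs(pearson(a, b)) for a, b in itertools.combinations(columns, 2)
    )


if __name__ == "__main__":
    print(round(strongest_chance_correlation(), 3))
```

With 200 variables there are 19,900 pairs to compare, so some pairs end up strongly correlated even though nothing connects them. This is why a pattern found by mining a large data set still needs an explanation before it can be trusted.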
Next, Kitchin describes data-driven science, which is something of a compromise between traditional science and wholeheartedly accepting Big Data’s findings while throwing out the need for theories. Data-driven science assists researchers by using data to “reveal information which will be of potential interest and is worthy of further research.” Instead of wasting resources looking at every pattern found, researchers focus their attention only on the ones deemed relevant to their work. In my opinion, this allows the best of both worlds: correlations between variables can be used to help people understand why things are happening, in order to solve problems more effectively (similar to my opinion of Anderson’s article).
While Big Data and quantitative approaches can be useful in the digital humanities and computational social sciences, it should be recognized that this data has limitations for understanding human life. These limitations “should be recognized and complemented with other approaches” to better understand life, according to Kitchin. It seems as though Big Data is currently unable to conduct ‘close readings’, so it is limited to surface-level analysis and cannot capture the value of literature. This makes Big Data less useful in the humanities, as well as in the social sciences, because of how unpredictable humans are.
Overall, Kitchin has written an important article about what he sees as the best ways that Big Data can currently be used. He highlights how Big Data can show us numerous correlations and patterns that we could not previously have found, but he also tells us that the results from Big Data come with biases and shortcomings. I agree with the section of the article on data-driven science, where we can take the usefulness of Big Data and apply it efficiently to solve issues in our lives. I also like how he identifies a gap in the usefulness of Big Data for the digital humanities and social sciences, and how he tries to address this gap by suggesting a possible approach based on GIS and radical statistics.