Ethics and Political Correctness in Algorithms

Recently I read an article about ethics in data science. The ethics in question is not about plagiarism, disclosure of confidential data, or dishonesty, but about the decisions made when designing a model with ethical considerations in mind. It got me thinking, though I have not reached any conclusions. A lot of countries have a long and painful history of racism. In

Computer Science Eclipsing Funding for Statisticians

The current data science trend treats a collection of algorithms, known as machine learning, as the "golden key" to all numerical problems. I use double quotes because I know it cannot do everything. A lot of these algorithms are optimization problems, and many of them are related to statistics, for example, hidden

Computational Journalism

In the past, we "sensed" what the hot issues of the day were (and we often still do today). Methods of "sensing," or "detecting," are now more sophisticated, however, as computational technologies have advanced. The methods involved can be grouped into a field called "computational journalism." Recently, there was a blog post by

Core Competencies of Data Science Education

What should a data scientist know? What are the core skills of a data scientist? I have not seen another job title so vague and ambiguous, or one that provokes so much debate and discussion. The BD2K (Big Data to Knowledge) Centers of the NIH (National Institutes of Health) [Ohno-Machado 2014] have issued funding to a few tertiary colleges

Statistics Nowadays

There is no doubt that everyone in the so-called big data industry must know some statistics. However, statistics means different things to different people. Traditional Statistics: Statistics is an old field that was developed in the 18th century. In those times, people were urged to draw conclusions from a vast amount of data which

Dream of Automation

It is the fantasy of many entrepreneurs, scientists, and engineers to develop software that can perform feature generation, training, and prediction automatically. Of course, this is wishful thinking. There is no free lunch. Big companies that have abundant resources (training data, brains, clusters) can probably do something