Abstract: Big data analytics is the process of examining large amounts of data of a variety of types (big data) to uncover hidden patterns, unknown correlations, and other useful information. Its revolutionary potential is now widely recognized. Data complexity, heterogeneity, scale, and timeliness make data analysis a clear bottleneck in many biomedical applications, owing to the complexity of the patterns and the lack of scalability of the underlying algorithms. Advanced machine learning and data mining algorithms are being developed to address one or more of the challenges listed above. Typically, the complexity of potential patterns grows exponentially with the data complexity, and so does the size of the pattern space. To avoid an exhaustive search through the pattern space, machine learning and data mining algorithms usually employ a greedy approach to search for a local optimum in the solution space, or use a branch-and-bound approach to seek optimal solutions, and are consequently often implemented as iterative or recursive procedures. To improve efficiency, these algorithms often exploit the dependencies between potential patterns to maximize in-memory computation and/or leverage special hardware for acceleration. These strategies lead to strong data dependency, operation dependency, and hardware dependency, and sometimes to ad hoc solutions that cannot be generalized to a broader scope. In this talk, I will present some open challenges faced by data scientists in biomedical fields and the current approaches taken to tackle these challenges.