Research Interests

Chemical reaction network

The study of chemical reaction networks (CRNs) is an active field concerned with the kinetics and thermodynamics of systems consisting of multiple species and reactions. Researchers use tools from statistical physics and chemical physics to understand these two aspects theoretically, and the bridge between them is itself an interesting topic to investigate. However, a rigorous establishment of the proposed laws, deeper insight into the underlying mechanisms, and an account of how these basic laws give rise to the phenomena observed in the lab are still lacking. Recent developments, since the advent of quantum mechanics, mainly focus on the relation between the macroscopic model and the microscopic model. Because the microscopic model introduces too many variables, an intermediate description is usually preferred, and the mesoscopic model becomes the link between the two. This is where probability theory and multiscale theory come in and can be applied to make the theory more rigorous and complete.
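As a small illustration of what a mesoscopic description looks like, the sketch below simulates a single-species birth-death network with the Gillespie stochastic simulation algorithm and compares its long-time average with the macroscopic steady state. The network, the rate constants and the function names are assumptions chosen for illustration only; they are not taken from any specific model discussed here.

```python
import random


def gillespie_birth_death(k_birth, k_death, x0, t_end, seed=0):
    """Simulate 0 -> X (rate k_birth) and X -> 0 (rate k_death * x).

    Returns the jump times and the copy number of X after each jump.
    """
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        a_birth = k_birth            # propensity of the birth reaction
        a_death = k_death * x        # propensity of the death reaction
        a_total = a_birth + a_death
        if a_total == 0.0:
            break                    # no reaction can fire any more
        t += rng.expovariate(a_total)  # exponential waiting time
        if t > t_end:
            break
        # Choose which reaction fires, proportionally to its propensity.
        if rng.random() * a_total < a_birth:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts


def time_average(times, counts, t_end):
    """Time-weighted average of a piecewise-constant trajectory on [0, t_end]."""
    total = 0.0
    for i, count in enumerate(counts):
        t_next = times[i + 1] if i + 1 < len(times) else t_end
        total += count * (t_next - times[i])
    return total / t_end


if __name__ == "__main__":
    times, counts = gillespie_birth_death(50.0, 1.0, x0=0, t_end=200.0)
    # The macroscopic rate equation dx/dt = k_birth - k_death * x
    # has the steady state k_birth / k_death = 50.
    print("stochastic time average:", time_average(times, counts, 200.0))
    print("macroscopic steady state:", 50.0 / 1.0)
```

For this linear network the stochastic average fluctuates around the macroscopic steady state; for nonlinear networks the two descriptions can differ, which is part of what makes the link between scales interesting.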

Uncoupled regression (COSLIR)

In practice we often have to cope with uncoupled data rather than coupled data, that is, data in which labels and features are not paired. You only have one set of features and one set of labels. The question is whether you can still draw inferences from such a data set. This situation arises naturally in applied research. It was originally proposed in sociology and economics, where the data come from two separate institutes and you want to know whether there is a relation between the two factors. Sometimes the two data sets come from the same individuals. In that case you only need to find a permutation of the indices that pairs the data. This is a relatively simple and, I think, well-studied setting (one solves an optimization problem with the permutation as a parameter), and there are many papers on it. The case we face is worse. We are not even assured that the data come from the same individuals; in fact, we are sure they cannot. In other words, we obtain data from the two institutes independently, or equivalently we sample from two distributions, and we want to determine the relation between these two distributions. The motivation comes from biology, where the cells are dead after each measurement, so coupled data cannot be obtained. One possible solution, called covariance restricted sparse linear regression (COSLIR), solves quadratic equations relating the means and variances of the two distributions, with an l1 penalty to obtain a sparse result. The method works well for problems of moderate size (hundreds of dimensions), but it is rather inefficient when the dimension grows to the thousands. Our current goal is to use sampling methods recently developed in optimization theory to solve high-dimensional COSLIR.
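To make the idea of matching means and variances concrete, here is a minimal sketch under stated assumptions. It assumes a linear model y = A x with a square matrix A, so that mean_y = A mean_x and cov_y = A cov_x A^T, and it minimizes the squared mismatch of these moments plus an l1 penalty on A. The solver is a plain proximal gradient method with backtracking, swapped in for simplicity; it is not the COSLIR algorithm itself, and the objective weights `lam` and `mu` and all function names are hypothetical.

```python
import numpy as np


def soft_threshold(M, tau):
    """Proximal operator of tau * ||M||_1 (entrywise shrinkage)."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)


def smooth_loss(A, mean_x, cov_x, mean_y, cov_y, mu):
    """0.5 * ||A cov_x A^T - cov_y||_F^2 + 0.5 * mu * ||A mean_x - mean_y||^2."""
    R = A @ cov_x @ A.T - cov_y
    r = A @ mean_x - mean_y
    return 0.5 * np.sum(R ** 2) + 0.5 * mu * np.sum(r ** 2)


def smooth_grad(A, mean_x, cov_x, mean_y, cov_y, mu):
    """Gradient of smooth_loss with respect to A (cov_x, cov_y symmetric)."""
    R = A @ cov_x @ A.T - cov_y
    r = A @ mean_x - mean_y
    return 2.0 * R @ A @ cov_x + mu * np.outer(r, mean_x)


def fit_moment_matching(X, Y, lam=0.01, mu=1.0, n_iter=500, step=1.0):
    """Estimate a sparse A from unpaired samples X (n, d) and Y (m, d)."""
    mean_x, mean_y = X.mean(axis=0), Y.mean(axis=0)
    cov_x = np.atleast_2d(np.cov(X, rowvar=False))
    cov_y = np.atleast_2d(np.cov(Y, rowvar=False))
    moments = (mean_x, cov_x, mean_y, cov_y, mu)

    def objective(M):
        return smooth_loss(M, *moments) + lam * np.abs(M).sum()

    A = np.eye(X.shape[1])           # start from the identity map
    obj = objective(A)
    history = [obj]
    for _ in range(n_iter):
        G = smooth_grad(A, *moments)
        # Backtracking: halve the step until the objective does not increase.
        while step > 1e-12:
            cand = soft_threshold(A - step * G, step * lam)
            cand_obj = objective(cand)
            if cand_obj <= obj:
                break
            step *= 0.5
        else:
            break                    # no acceptable step was found
        A, obj = cand, cand_obj
        history.append(obj)
    return A, history


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A_true = np.diag([2.0, 0.5, 1.0])
    X = rng.normal(loc=1.0, size=(2000, 3))
    # Y is generated from an independent sample, so no row of Y pairs with X.
    Y = rng.normal(loc=1.0, size=(2000, 3)) @ A_true.T
    A_hat, history = fit_moment_matching(X, Y)
    print("objective:", history[0], "->", history[-1])
    print(np.round(A_hat, 2))
```

Note that first and second moments alone do not determine A uniquely in general; the sparsity penalty and the choice of starting point select one of the candidate solutions, so the output should be read as one moment-consistent estimate rather than a guaranteed recovery.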

Gene regulatory network inference

In fact, the theory above is our first attempt to analyze the gene regulatory network (GRN) from scRNA-seq data alone. We also want to extract more from the data and use scRNA-seq data to help determine the GRN.